智能论文笔记

NeRF in the Dark: High Dynamic Range View Synthesis from Noisy Raw Images

Ben Mildenhall , Peter Hedman , Ricardo Martin-Brualla , Pratul Srinivasan , Jonathan T. Barron

分类：计算机视觉

2021-11-26

神经辐射字段（NERF）是一种用于高质量新颖观看综合的技术从一系列姿势输入图像。与大多数视图合成方法一样，NERF使用TONEMAPPED的低动态范围（LDR）作为输入;这些图像已经通过流畅的相机管道处理，平滑细节，剪辑突出显示，并扭曲了原始传感器数据的简单噪声分布。我们修改NERF以直接在线性原始图像直接培训，保持场景的完整动态范围。通过从生成的NERF渲染原始输出图像，我们可以执行新颖的高动态范围（HDR）视图综合任务。除了改变相机的观点外，我们还可以在事实之后操纵焦点，曝光和调度率。虽然单个原始图像显然比后处理的原始图像显着更大，但我们表明NERF对原始噪声的零平均分布非常强大。当优化许多嘈杂的原始输入（25-200）时，NERF会产生一个场景表示，如此准确的，即其呈现的新颖视图优于在同一宽基线输入图像上运行的专用单个和多像深生物丹机。因此，我们调用Rawnerf的方法可以从近黑暗中捕获的极其嘈杂的图像中重建场景。

translated by 谷歌翻译

Advances in Neural Rendering

Ayush Tewari , Justus Thies , Ben Mildenhall , Pratul Srinivasan , Edgar Tretschk , Yifan Wang , Christoph Lassner , Vincent Sitzmann , Ricardo Martin-Brualla , Stephen Lombardi

分类：计算机视觉

2021-11-10

综合照片 - 现实图像和视频是计算机图形的核心，并且是几十年的研究焦点。传统上，使用渲染算法（如光栅化或射线跟踪）生成场景的合成图像，其将几何形状和材料属性的表示为输入。统称，这些输入定义了实际场景和呈现的内容，并且被称为场景表示（其中场景由一个或多个对象组成）。示例场景表示是具有附带纹理的三角形网格（例如，由艺术家创建），点云（例如，来自深度传感器），体积网格（例如，来自CT扫描）或隐式曲面函数（例如，截短的符号距离）字段）。使用可分辨率渲染损耗的观察结果的这种场景表示的重建被称为逆图形或反向渲染。神经渲染密切相关，并将思想与经典计算机图形和机器学习中的思想相结合，以创建用于合成来自真实观察图像的图像的算法。神经渲染是朝向合成照片现实图像和视频内容的目标的跨越。近年来，我们通过数百个出版物显示了这一领域的巨大进展，这些出版物显示了将被动组件注入渲染管道的不同方式。这种最先进的神经渲染进步的报告侧重于将经典渲染原则与学习的3D场景表示结合的方法，通常现在被称为神经场景表示。这些方法的一个关键优势在于它们是通过设计的3D-一致，使诸如新颖的视点合成捕获场景的应用。除了处理静态场景的方法外，我们还涵盖了用于建模非刚性变形对象的神经场景表示...

translated by 谷歌翻译

Neural RGB-D Surface Reconstruction

Dejan Azinović , Ricardo Martin-Brualla , Dan B Goldman , Matthias Nießner , Justus Thies

分类：计算机视觉

2021-04-09

获取房间规模场景的高质量3D重建对于即将到来的AR或VR应用是至关重要的。这些范围从混合现实应用程序进行电话会议，虚拟测量，虚拟房间刨，到机器人应用。虽然使用神经辐射场（NERF）的基于卷的视图合成方法显示有希望再现对象或场景的外观，但它们不会重建实际表面。基于密度的表面的体积表示在使用行进立方体提取表面时导致伪影，因为在优化期间，密度沿着射线累积，并且不在单个样本点处于隔离点。我们建议使用隐式函数（截短的签名距离函数）来代表表面来代表表面。我们展示了如何在NERF框架中纳入此表示，并将其扩展为使用来自商品RGB-D传感器的深度测量，例如Kinect。此外，我们提出了一种姿势和相机细化技术，可提高整体重建质量。相反，与集成NERF的深度前瞻性的并发工作，其专注于新型视图合成，我们的方法能够重建高质量的韵律3D重建。

translated by 谷歌翻译

Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields

Jonathan T. Barron , Ben Mildenhall , Matthew Tancik , Peter Hedman , Ricardo Martin-Brualla , Pratul P. Srinivasan

分类：

2021-03-24

The rendering procedure used by neural radiance fields (NeRF) samples a scene with a single ray per pixel and may therefore produce renderings that are excessively blurred or aliased when training or testing images observe scene content at different resolutions. The straightforward solution of supersampling by rendering with multiple rays per pixel is impractical for NeRF, because rendering each ray requires querying a multilayer perceptron hundreds of times. Our solution, which we call "mip-NeRF" (à la "mipmap"), extends NeRF to represent the scene at a continuously-valued scale. By efficiently rendering anti-aliased conical frustums instead of rays, mip-NeRF reduces objectionable aliasing artifacts and significantly improves NeRF's ability to represent fine details, while also being 7% faster than NeRF and half the size. Compared to NeRF, mip-NeRF reduces average error rates by 17% on the dataset presented with NeRF and by 60% on a challenging multiscale variant of that dataset that we present. Mip-NeRF is also able to match the accuracy of a brute-force supersampled NeRF on our multiscale dataset while being 22× faster.

translated by 谷歌翻译

IBRNet: Learning Multi-View Image-Based Rendering

Qianqian Wang , Zhicheng Wang , Kyle Genova , Pratul Srinivasan , Howard Zhou , Jonathan T. Barron , Ricardo Martin-Brualla , Noah Snavely , Thomas Funkhouser

分类：

2021-02-25

We present a method that synthesizes novel views of complex scenes by interpolating a sparse set of nearby views. The core of our method is a network architecture that includes a multilayer perceptron and a ray transformer that estimates radiance and volume density at continuous 5D locations (3D spatial locations and 2D viewing directions), drawing appearance information on the fly from multiple source views. By drawing on source views at render time, our method hearkens back to classic work on image-based rendering (IBR), and allows us to render high-resolution imagery. Unlike neural scene representation work that optimizes per-scene functions for rendering, we learn a generic view interpolation function that generalizes to novel scenes. We render images using classic volume rendering, which is fully differentiable and allows us to train using only multiview posed images as supervision. Experiments show that our method outperforms recent novel view synthesis methods that also seek to generalize to novel scenes. Further, if fine-tuned on each scene, our method is competitive with state-of-the-art single-scene neural rendering methods. 1

translated by 谷歌翻译

Time-Travel Rephotography

Xuan Luo , Xuaner Zhang , Paul Yoo , Ricardo Martin-Brualla , Jason Lawrence , Steven M. Seitz

分类：计算机视觉

2020-12-22

许多历史人民曾经被旧的，褪色，黑白照片捕获，这是由于早期摄像机的局限性和时间的流逝而被扭曲。本文模拟了与现代相机回到的时间，以重新摄像机。与使用独立操作的传统图像恢复过滤器不同，如去噪，着色和超级度量，我们利用Stylegan2框架将旧照片投影到现代高分辨率照片的空间中，在统一的框架中实现所有这些效果。这种方法的独特挑战是在原始照片中保留了主题的身份和姿势，同时丢弃了在低质量古董照片中经常看到的许多伪影。我们对目前最先进的恢复过滤器的比较显示出各种重要历史人民的重大改进和引人注目的结果。

translated by 谷歌翻译

NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections

Ricardo Martin-Brualla , Noha Radwan , Mehdi S. M. Sajjadi , Jonathan T. Barron , Alexey Dosovitskiy , Daniel Duckworth

分类：

2020-08-05

We present a learning-based method for synthesizing novel views of complex scenes using only unstructured collections of in-the-wild photographs. We build on Neural Radiance Fields (NeRF), which uses the weights of a multilayer perceptron to model the density and color of a scene as a function of 3D coordinates. While NeRF works well on images of static subjects captured under controlled settings, it is incapable of modeling many ubiquitous, real-world phenomena in uncontrolled images, such as variable illumination or transient occluders. We introduce a series of extensions to NeRF to address these issues, thereby enabling accurate reconstructions from unstructured image collections taken from the internet. We apply our system, dubbed NeRF-W, to internet photo collections of famous landmarks, and demonstrate temporally consistent novel view renderings that are significantly closer to photorealism than the prior state of the art.

translated by 谷歌翻译

Assessment of creditworthiness models privacy-preserving training with synthetic data

Ricardo Muñoz-Cancino , Cristián Bravo , Sebastián A. Ríos , Manuel Graña

分类：机器学习

2022-12-31

Credit scoring models are the primary instrument used by financial institutions to manage credit risk. The scarcity of research on behavioral scoring is due to the difficult data access. Financial institutions have to maintain the privacy and security of borrowers' information refrain them from collaborating in research initiatives. In this work, we present a methodology that allows us to evaluate the performance of models trained with synthetic data when they are applied to real-world data. Our results show that synthetic data quality is increasingly poor when the number of attributes increases. However, creditworthiness assessment models trained with synthetic data show a reduction of 3\% of AUC and 6\% of KS when compared with models trained with real data. These results have a significant impact since they encourage credit risk investigation from synthetic data, making it possible to maintain borrowers' privacy and to address problems that until now have been hampered by the availability of information.

translated by 谷歌翻译

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

分类：计算机视觉 | 机器学习

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

translated by 谷歌翻译

Transformer-based normative modelling for anomaly detection of early schizophrenia

Pedro F Da Costa , Jessica Dafflon , Sergio Leonardo Mendes , João Ricardo Sato , M. Jorge Cardoso , Robert Leech , Emily JH Jones , Walter H. L. Pinaya

分类：机器学习 | 人工智能

2022-12-08

Despite the impact of psychiatric disorders on clinical health, early-stage diagnosis remains a challenge. Machine learning studies have shown that classifiers tend to be overly narrow in the diagnosis prediction task. The overlap between conditions leads to high heterogeneity among participants that is not adequately captured by classification models. To address this issue, normative approaches have surged as an alternative method. By using a generative model to learn the distribution of healthy brain data patterns, we can identify the presence of pathologies as deviations or outliers from the distribution learned by the model. In particular, deep generative models showed great results as normative models to identify neurological lesions in the brain. However, unlike most neurological lesions, psychiatric disorders present subtle changes widespread in several brain regions, making these alterations challenging to identify. In this work, we evaluate the performance of transformer-based normative models to detect subtle brain changes expressed in adolescents and young adults. We trained our model on 3D MRI scans of neurotypical individuals (N=1,765). Then, we obtained the likelihood of neurotypical controls and psychiatric patients with early-stage schizophrenia from an independent dataset (N=93) from the Human Connectome Project. Using the predicted likelihood of the scans as a proxy for a normative score, we obtained an AUROC of 0.82 when assessing the difference between controls and individuals with early-stage schizophrenia. Our approach surpassed recent normative methods based on brain age and Gaussian Process, showing the promising use of deep generative models to help in individualised analyses.

translated by 谷歌翻译